24 research outputs found

    Enhancement of Subjective Logic for Semantic Document Analysis Using Hierarchical Document Signature

    In this paper, an extension of Subjective Logic (SL) is presented which uses semantic information from a document to form 'opinions' about a sentence. The method computes the semantic overlap of events (words or sentences) using Hierarchical Document Signature (HDS) and uses it as evidence to formulate SL belief measures that order sentences according to their importance. The stronger the opinion, the greater the significance. These significant sentences then form extractive summaries of the document. The experimental results show that summaries generated by this method are more similar to human-generated ones and, on average, outperform the baseline summaries over all the data sets considered.
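As a rough illustration of how evidence can be turned into an SL opinion and used to order sentences, the sketch below applies the standard binomial evidence-to-opinion mapping from Subjective Logic (belief, disbelief and uncertainty derived from positive and negative evidence counts with a non-informative prior weight of 2). How the paper derives evidence from HDS overlap is not stated in the abstract, so the `(r, s)` evidence pairs, the function names and the ranking by expected probability are all assumptions made for this example.

```python
from dataclasses import dataclass

# Non-informative prior weight commonly used in Subjective Logic.
PRIOR_WEIGHT = 2.0


@dataclass(frozen=True)
class Opinion:
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def expected(self) -> float:
        """Projected probability: belief plus the base-rate share of uncertainty."""
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(r: float, s: float, base_rate: float = 0.5) -> Opinion:
    """Map positive evidence r and negative evidence s to a binomial opinion."""
    if r < 0 or s < 0:
        raise ValueError("evidence must be non-negative")
    total = r + s + PRIOR_WEIGHT
    return Opinion(r / total, s / total, PRIOR_WEIGHT / total, base_rate)


def rank_sentences(evidence: dict) -> list:
    """Order sentence ids by the expected probability of their opinions.

    `evidence` maps a sentence id to a hypothetical (r, s) pair, for example
    semantic-overlap support versus lack of support.
    """
    scores = {sid: opinion_from_evidence(r, s).expected()
              for sid, (r, s) in evidence.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

With no evidence at all the opinion is fully uncertain and falls back to the base rate, which is why a sentence with no evidence can still rank above one with mostly negative evidence.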

    Evidence based fuzzy single document analysis

    Human beings can extract meaningful information from single documents and can even summarize them according to their interests. Computers, on the other hand, are used to process increasingly large numbers of documents. Even with the vast number of documents we are drowning in, there will always be a need to analyze important documents which occur singly or in small numbers. For example, it is unlikely that a statistically significant number of airplanes will collide with tall buildings. Analyzing and extracting significant information from reports or documents related to this kind of scenario therefore requires subjective analysis of the data. This thesis uses structural fuzzy technology, subjective logic and higher-order singular value decomposition to extract information from single documents, or from a small collection of documents. The idea is to analyze the language and syntax used in the document to remove uncertainty, increase confidence, and improve the reliability of decision-making, which has many applications, including in the media and intelligence gathering. This is illustrated through the generation of extractive summaries using these techniques. The results are validated by comparing the summaries produced by my techniques with human-generated summaries and with other machine-generated summaries. My summaries are more similar to the human summaries than the rest, and this is the major result captured in this thesis.

    Semantic Hierarchical Document Signature For Determining Sentence Similarity

    In this paper, we present a new approach that incorporates semantic information from a document, in the form of Hierarchical Document Signature (HDS), to measure semantic similarity between sentences. Because natural language expressions vary so widely, it is essential to exploit the semantic properties of a document to accurately identify semantically similar sentences, since sentences conveying the same fact or concept may differ lexically and syntactically. Conversely, sentences which share many words may not convey the same meaning. This has a significant impact on the performance of many text mining applications where sentence-level judgment is involved. Our HDS uses the natural hierarchy of the document and represents it in a modularized form, from document level to sentence level and from sentence level to word level, aggregating similarity components at the lower levels and propagating them to the next higher level to produce the final similarity between sentences. The evaluation of our HDS model shows that it resembles the human decision-making process to a greater extent than vector space models which use only the 'bag of words' concept.
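The bottom-up aggregation described above can be sketched, under strong simplifying assumptions, as a two-level computation: word-level similarities are computed first and then propagated up to a sentence-level score. This is not the HDS model itself, whose components are not given in the abstract. The exact-match `word_similarity` placeholder, the best-match averaging and the function names are all hypothetical choices made for illustration; a semantic word measure would normally replace the placeholder.

```python
def word_similarity(a: str, b: str) -> float:
    """Placeholder word-level measure: 1.0 for a case-insensitive match, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def directed_similarity(src, dst, word_sim) -> float:
    """Average, over words in src, of the best word-level match found in dst."""
    if not src or not dst:
        return 0.0
    return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)


def sentence_similarity(s1: str, s2: str, word_sim=word_similarity) -> float:
    """Propagate word-level scores up to one symmetric sentence-level score."""
    t1, t2 = s1.split(), s2.split()
    forward = directed_similarity(t1, t2, word_sim)
    backward = directed_similarity(t2, t1, word_sim)
    return 0.5 * (forward + backward)
```

Passing a different `word_sim` function changes the lowest level without touching the aggregation step, which mirrors the modular, level-by-level structure the abstract describes.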

    A Term Association Inference Model for Single Documents: A Stepping Stone for Investigation through Information Extraction

    In this paper, we propose a term association model which extracts significant terms as well as the important regions from a single document. This model is a basis for a systematic form of subjective data analysis which captures the relatedness of the different discourse structures considered in the document, without requiring a predefined knowledge base. It is a stepping stone for investigation or security purposes, where possible patterns need to be identified from one or a few witness statements. This is unlikely to be possible with predictive data mining, where the system cannot work efficiently in the absence of existing patterns or a large amount of data. The model overcomes the basic drawback of existing language models for choosing significant terms in single documents. We used a text summarization method to validate a part of this work and compare our term significance with a modified version of Salton's [1]

    Fuzzy Word Similarity: A Semantic Approach Using WordNet

    In this paper we present a hybrid measure of semantic word similarity using a fuzzy inference system which combines corpus-based distance measures with gloss overlap to obtain the final similarity between two words. We use WordNet as a lexical dictionary to obtain semantic information about words. We show that this new measure correlates reasonably well with human judgments and that average performance is boosted by using a triangular membership function in the output.
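For readers unfamiliar with the term, a triangular membership function and its use in an output stage can be sketched as follows. This is a generic illustration, not the paper's inference system: the parameters `a`, `b`, `c`, the `[0, 1]` output domain, the clipping by a rule firing strength and the grid-based centroid defuzzification are all assumptions made for the example.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at a and c, rising linearly to 1 at the peak b."""
    if not a <= b <= c:
        raise ValueError("expected a <= b <= c")
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def centroid(firing_strength: float, a: float, b: float, c: float,
             steps: int = 100) -> float:
    """Defuzzify a triangular output set clipped at firing_strength.

    The output domain is assumed to be [0, 1], sampled on a uniform grid.
    Returns 0.0 when the clipped set is empty.
    """
    xs = [i / steps for i in range(steps + 1)]
    mus = [min(firing_strength, triangular(x, a, b, c)) for x in xs]
    total = sum(mus)
    if total == 0:
        return 0.0
    return sum(x * mu for x, mu in zip(xs, mus)) / total
```

In a full system, each rule would clip its own output set and the clipped sets would be combined before defuzzification; the single-set version here only shows the shape of the computation.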

    Performance Enhancement of Hierarchical Document Signature: A Comprehensive Study

    Hierarchical Document Signature (HDS) has been successfully applied in document computing to find the similarity between different pieces of text [1], [2], [3], for example sentence-sentence similarity and sentence-phrase similarity. HDS is application specific: it depends on different features at different levels. This paper therefore presents a comprehensive study of how the performance of HDS in finding semantic sentence similarity can be enhanced by tuning some of its significant features. The experimental results support this and show the optimal conditions under which HDS performs similarly to humans.